PROV-man: A PROV-compliant toolkit for provenance management

Authors

  • Ammar Benabdelkader
  • Antoine H. C. van Kampen
  • Sílvia Delgado Olabarriaga
Abstract

Discoveries in modern science can take years and involve the contribution of large amounts of data, many people and various tools. Although good scientific practice dictates that findings should be reproducible, in practice there are very few automated tools that actually support traceability of the scientific method employed, in particular when various experimental environments are involved at different research phases. Data provenance tracking approaches can play a major role in addressing many of these challenges. These approaches propose ways to capture, manage, and use provenance information to support the traceability of the scientific methods in heterogeneous environments. PROV is a W3C standard that provides a comprehensive model for data and semantics representation with common vocabularies and rich concepts to describe provenance. Nevertheless, it is difficult for domain scientists to easily understand and adopt all the richness provided by PROV. In this paper we describe the design and implementation of the provenance manager PROV-man, a PROV-compliant framework that facilitates the tasks of scientists in integrating provenance capabilities into their data analysis tools. PROV-man provides functionalities to create and manipulate provenance data in a consistent manner and ensures its permanent storage. It also provides a set of interfaces to serialize and export provenance data into various data formats, serving interoperability. The open architecture of PROV-man, consisting of an API and a configurable database, allows for its easy deployment within existing and newly developed software tools. The paper presents examples illustrating the usage of PROV-man. The first example illustrates how to create and manipulate provenance data of an online newspaper article using PROV-man. The second example demonstrates and evaluates the PROV-man implementation in a more complex case for collection of provenance data about biomedical data analysis activities that are carried out using a distributed computing infrastructure.
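
As a rough illustration of the kind of provenance data the first example is concerned with, the sketch below records an online newspaper article, the editing activity that generated it, and the editor it is attributed to, then serializes the records as JSON. It does not use the PROV-man API. The `ex:` identifiers, the `make_document` and `generated_by` helpers, and the dictionary layout (modeled loosely on the PROV-JSON serialization) are all assumptions chosen for illustration.

```python
import json


def make_document():
    """Build a small PROV-style document as nested dictionaries.

    All identifiers (ex:article, ex:editing, ex:editor) are hypothetical.
    """
    return {
        "prefix": {"ex": "http://example.org/"},
        # The thing whose provenance is being described.
        "entity": {"ex:article": {"prov:type": "ex:NewspaperArticle"}},
        # The process that produced it.
        "activity": {"ex:editing": {}},
        # The party responsible for it.
        "agent": {"ex:editor": {"prov:type": "prov:Person"}},
        # Relations linking the three core PROV concepts.
        "wasGeneratedBy": {
            "_:gen1": {"prov:entity": "ex:article", "prov:activity": "ex:editing"}
        },
        "wasAttributedTo": {
            "_:attr1": {"prov:entity": "ex:article", "prov:agent": "ex:editor"}
        },
    }


def generated_by(document, entity_id):
    """Return the activities recorded as having generated the given entity."""
    return [
        relation["prov:activity"]
        for relation in document["wasGeneratedBy"].values()
        if relation["prov:entity"] == entity_id
    ]


document = make_document()
# Serialize for exchange, then read it back to show the round trip.
serialized = json.dumps(document, indent=2, sort_keys=True)
restored = json.loads(serialized)
print(generated_by(restored, "ex:article"))
```

Plain dictionaries keep the sketch self-contained. A toolkit such as the one the abstract describes would additionally validate the records against the PROV model and persist them in a database.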

Similar resources

Interoperability for Provenance-aware Databases using PROV and JSON

Since its inception, the PROV standard has been widely adopted as a standardized exchange format for provenance information. Surprisingly, this standard is currently not supported by provenance-aware database systems, limiting their interoperability with other provenance-aware systems. In this work we introduce techniques for exporting database provenance as PROV documents, importing PROV graphs ...

JSON and its use in Semantic Web

The semantic web has evolved from the current web and aims to provide a web that allows for easy retrieval and accessing of information by both man and machine. It provides a wide variety of technology stacks, language standards and software components which help both man and machine to access data easily. Intelligent information retrieval and the credibility of data are managed in semantic...

A Software Framework for Data Provenance

Data provenance refers to the historical record of the derivation of the data, allowing the reproduction of experiments, interpretation of results and identification of problems through the analysis of the processes that originated the data. Data provenance contributes to the evaluation of experiments. This paper presents a framework for data provenance using the W3C provenance data model, call...

PROV-O-Viz - Understanding the Role of Activities in Provenance

This paper presents PROV-O-Viz, a Web-based visualization tool for PROV-based provenance traces coming from various sources, that leverages Sankey Diagrams to reflect the flow of information through activities. We briefly discuss the advantages of this approach compared to other provenance visualization tools. PROV-O-Viz has already been used to visualize provenance traces generated by very dif...

D-PROV: Extending the PROV Provenance Model with Workflow Structure

This paper presents an extension to the W3C PROV provenance model, aimed at representing process structure. Although the modelling of process structure is out of the scope of the PROV specification, it is beneficial when capturing and analysing the provenance of data that is produced by programs or other formally encoded processes. In the paper, we motivate the need for such an extended model in t...

Journal title:
  • PeerJ PrePrints

Volume 3  Issue 

Pages  -

Publication date 2015